Advances in Information Retrieval
31th European Conference on IR Research, ECIR 2009, Toulouse, France, April 6-9, 2009. Proceedings

Mohand Boughanem, Catherine Berrut, Josiane Mothe, Chantal Soule-Dupuy


chapter

Cover Coefficient-Based Multi-document Summarization

Gonenc Ercan, Fazli Can

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 670-674

In this paper we present a generic, language independent multi-document summarization system forming extracts using the cover coefficient concept. Cover Coefficient-based Summarizer (CCS) uses similarity between sentences to determine representative sentences. Experiments indicate that CCS is an efficient algorithm that is able to generate quality summaries online.
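As an illustration of the cover coefficient concept mentioned in this abstract, the following minimal sketch computes a sentence-by-sentence cover coefficient matrix from a sentence-by-term matrix. It assumes the usual definition from the cover coefficient clustering literature, c_ij = (1 / rowsum_i) * sum_k (d_ik * d_jk / colsum_k); the function name `cover_coefficients` and the toy matrix are hypothetical, and the abstract does not say how CCS selects sentences from these values.

```python
def cover_coefficients(matrix):
    """Return the cover coefficient matrix C for a sentence-by-term matrix.

    matrix[i][k] is the weight of term k in sentence i. C[i][j] indicates
    how much sentence i is covered by sentence j (assumed definition).
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][k] for i in range(n_rows)) for k in range(n_cols)]

    c = [[0.0] * n_rows for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_rows):
            total = 0.0
            for k in range(n_cols):
                if col_sums[k] > 0:
                    total += matrix[i][k] * matrix[j][k] / col_sums[k]
            c[i][j] = total / row_sums[i] if row_sums[i] > 0 else 0.0
    return c


# Toy example: three sentences over three terms.
sentences = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
C = cover_coefficients(sentences)
```

Under this definition each row of C sums to 1, and the diagonal entry C[i][i] indicates how distinct sentence i is from the others.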

chapter

A Practitioner’s Guide for Static Index Pruning

Ismail Sengor Altingovde, Rifat Ozcan, Özgür Ulusoy

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 675-679

We compare the term- and document-centric static index pruning approaches as described in the literature and investigate their sensitivity to the scoring functions employed during the pruning and actual retrieval stages.

chapter

Revisiting N-Gram Based Models for Retrieval in Degraded Large Collections

Javier Parapar, Ana Freire, Álvaro Barreiro

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 680-684

Traditional retrieval models based on term matching are not effective on collections of degraded documents (for instance, the output of OCR or ASR systems). This paper presents an n-gram based distributed model for retrieval on large collections of degraded text. Evaluation was carried out with both the TREC Confusion Track and Legal Track collections showing that the presented approach outperforms in...
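To illustrate why n-grams can help where exact term matching fails on degraded text, here is a minimal sketch that compares two strings by their character trigrams. The Dice overlap and the helper names `char_ngrams` and `dice_overlap` are assumptions chosen for illustration; the abstract does not describe the paper's actual model or its distributed architecture.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased string."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def dice_overlap(a, b, n=3):
    """Dice coefficient between the character n-gram sets of a and b."""
    grams_a = char_ngrams(a, n)
    grams_b = char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# A simulated OCR error ("l" read as "1") breaks exact term matching,
# but most trigrams still match.
clean = "retrieval"
degraded = "retrieva1"
exact_match = clean == degraded
similarity = dice_overlap(clean, degraded)
```

In this toy case the two strings share six of their seven trigrams, so the n-gram similarity stays high even though the terms no longer match exactly.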

chapter

A Simple Linear Ranking Algorithm Using Query Dependent Intercept Variables

Nir Ailon

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 685-690

The LETOR website contains three information retrieval datasets used as a benchmark for testing machine learning ideas for ranking. Participating algorithms are measured using standard IR ranking measures (NDCG, precision, MAP). Similarly to other participating algorithms, we train a linear classifier. In contrast, we define an additional free benchmark variable for each query. This allows expressing...

chapter

Measurement Techniques and Caching Effects

Stefan Pohl, Alistair Moffat

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 691-695

Overall query execution time consists of the time spent transferring data from disk to memory, and the time spent performing actual computation. In any measurement of overall time on a given hardware configuration, the two separate costs are aggregated. This makes it hard to reproduce results and to infer which of the two costs is actually affected by modifications proposed by researchers. In this...

chapter

On Automatic Plagiarism Detection Based on n-Grams Comparison

Alberto Barrón-Cedeño, Paolo Rosso

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 696-700

When automatic plagiarism detection is carried out considering a reference corpus, a suspicious text is compared to a set of original documents in order to relate the plagiarised text fragments to their potential source. One of the biggest difficulties in this task is to locate plagiarised fragments that have been modified (by rewording, insertion or deletion, for example) from the source text. ...
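The comparison of a suspicious text against a reference corpus can be sketched with word n-grams as follows. This is only an illustration: the containment measure, the whitespace tokenization, and the names `word_ngrams` and `containment` are assumptions, since the abstract is truncated before it describes the method actually used.

```python
def word_ngrams(text, n):
    """Return the set of word n-grams (as tuples) of a lowercased text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(suspicious, reference, n=2):
    """Fraction of the suspicious text's n-grams that occur in the reference."""
    suspicious_grams = word_ngrams(suspicious, n)
    if not suspicious_grams:
        return 0.0
    reference_grams = word_ngrams(reference, n)
    return len(suspicious_grams & reference_grams) / len(suspicious_grams)


# A reworded fragment still shares some bigrams with its source.
score = containment("the quick brown fox", "a quick brown fox jumps", n=2)
```

Smaller values of n tolerate more rewording at the cost of more accidental matches, which is the trade-off an n-gram based detector has to tune.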

chapter

Exploiting Visual Concepts to Improve Text-Based Image Retrieval

Sabrina Tollari, Marcin Detyniecki, Christophe Marsala, Ali Fakeri-Tabrizi, et al.

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 701-705

In this paper, we study how to automatically exploit visual concepts in a text-based image retrieval task. First, we use Forest of Fuzzy Decision Trees (FFDTs) to automatically annotate images with visual concepts. Second, optionally using WordNet, we match visual concepts to the textual query. Finally, we filter the text-based image retrieval result list using the FFDTs. This study is performed in the...

chapter

Choosing the Best MT Programs for CLIR Purposes – Can MT Metrics Be Helpful?

Kimmo Kettunen

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 706-712

This paper describes the use of MT metrics in choosing the best candidates for MT-based query translation resources. Our main metric is METEOR, but we also use NIST and BLEU. The language pair of our evaluation is English → German, because MT metrics still do not offer very many language pairs for comparison. We evaluated translations of CLEF 2003 topics of four different MT programs with MT metrics and...

chapter

Entropy-Based Static Index Pruning

Lei Zheng, Ingemar J. Cox

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 713-718

We propose a new entropy-based algorithm for static index pruning. The algorithm computes an importance score for each document in the collection based on the entropy of each term. A threshold is set according to the desired level of pruning and all postings associated with documents that score below this threshold are removed from the index, i.e. documents are removed from the collection. We compare...
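The pruning procedure described in this abstract can be sketched as follows. The abstract does not give the scoring formula, so the one used here is an assumption: each term's entropy is computed over its distribution across documents, and a document's score is the mean of (maximum entropy minus term entropy) over its terms. The names `term_entropies` and `prune_documents` and the toy index are hypothetical.

```python
import math
from collections import defaultdict


def term_entropies(index):
    """index maps term -> {doc_id: tf}. Returns term -> entropy in bits."""
    entropies = {}
    for term, postings in index.items():
        total = sum(postings.values())
        entropies[term] = -sum(
            (tf / total) * math.log2(tf / total) for tf in postings.values()
        )
    return entropies


def prune_documents(index, prune_fraction):
    """Remove all postings of the lowest-scoring fraction of documents."""
    docs = sorted({doc for postings in index.values() for doc in postings})
    max_entropy = math.log2(len(docs)) if len(docs) > 1 else 1.0
    entropies = term_entropies(index)

    # Assumed document score: mean informativeness of the terms it contains.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for term, postings in index.items():
        for doc in postings:
            sums[doc] += max_entropy - entropies[term]
            counts[doc] += 1
    scores = {doc: sums[doc] / counts[doc] for doc in docs}

    # The threshold is implied by the desired pruning level.
    n_remove = int(len(docs) * prune_fraction)
    removed = set(sorted(docs, key=lambda d: (scores[d], d))[:n_remove])

    pruned = {}
    for term, postings in index.items():
        kept = {doc: tf for doc, tf in postings.items() if doc not in removed}
        if kept:
            pruned[term] = kept
    return pruned, removed


toy_index = {
    "the": {"d1": 1, "d2": 1, "d3": 1},
    "entropy": {"d1": 3},
    "index": {"d1": 2, "d2": 2},
}
pruned_index, removed_docs = prune_documents(toy_index, prune_fraction=0.34)
```

In the toy index, document d3 contains only a term spread evenly over all documents, so it receives the lowest score and is the one removed.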

chapter

Representing User Navigation in XML Retrieval with Structural Summaries

Mir Sadek Ali, Mariano P. Consens, Birger Larsen

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 719-723

This poster presents a novel way to represent user navigation in XML retrieval using collection statistics from XML summaries. Currently, developing user navigation models in XML retrieval is costly and the models are specific to collected user assessments. We address this problem by proposing summary navigation models which describe user navigation in terms of XML summaries. We develop our proposal...

chapter

ESUM: An Efficient System for Query-Specific Multi-document Summarization

C. Ravindranath Chowdary, P. Sreenivasa Kumar

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 724-728

In this paper, we address the problem of generating a query-specific extractive summary in an efficient manner for a given set of documents. In many of the current solutions, the entire collection of documents is modeled as a single graph which is used for summary generation. Unlike these approaches, in this paper, we model each individual document as a graph and generate a query-specific summary...

chapter

Using WordNet’s Semantic Relations for Opinion Detection in Blogs

Malik Muhammad Saad Missen, Mohand Boughanem

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 729-733

Opinion detection from blogs has always been a challenge for researchers. One of the challenges is to find documents that specifically contain opinions on users’ information need. This requires text processing at the sentence level rather than at the document level. In this paper, we propose an opinion detection approach. The proposed approach focuses on the above problem by processing documents...

chapter

Improving Opinion Retrieval Based on Query-Specific Sentiment Lexicon

Seung-Hoon Na, Yeha Lee, Sang-Hyob Nam, Jong-Hyeok Lee

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 734-738

Lexicon-based approaches have been widely used for opinion retrieval due to their simplicity. However, no previous work has focused on the domain-dependency problem in opinion lexicon construction. This paper proposes simple feedback-style learning for query-specific opinion lexicon using the set of top-retrieved documents in response to a query. The proposed learning starts from the initial domain-independent...

chapter

Automatically Maintained Domain Knowledge: Initial Findings

Deirdre Lungley, Udo Kruschwitz

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 739-743

This paper explores the use of implicit user feedback in adapting the underlying domain model of an intranet search system. The domain model, a Formal Concept Analysis (FCA) lattice, is used as an interactive interface to allow user exploration of the context of an intranet query. Implicit user feedback is harnessed here to surmount the difficulty of achieving optimum document descriptors, essential...

chapter

A Framework of Evaluation for Question-Answering Systems

Sarra Ayari, Brigitte Grau

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 744-748

Evaluating a complex system is a complex task. Evaluation campaigns are organized each year to test different systems on global results, but they do not evaluate the relevance of the criteria used. Our purpose consists in modifying the intermediate results created by the components and inserting the new results into the process, without modifying the components. We will describe our framework of glass-box...

chapter

Combining Content and Context Similarities for Image Retrieval

Xiaojun Wan

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 749-754

CBIR has been a challenging problem and its performance relies on the underlying image similarity (distance) metric. Most existing metrics evaluate pairwise image similarity based only on image content, which is denoted as content similarity. In this study we propose a novel similarity metric to make use of the image contexts in an image collection. The context of an image is built by constructing...

chapter

Investigating the Global Semantic Impact of Speech Recognition Error on Spoken Content Collections

Martha Larson, Manos Tsagkias, Jiyin He, Maarten de Rijke

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 755-760

Errors in speech recognition transcripts have a negative impact on effectiveness of content-based speech retrieval and present a particular challenge for collections containing conversational spoken content. We propose a Global Semantic Distortion (GSD) metric that measures the collection-wide impact of speech recognition error on spoken content retrieval in a query-independent manner. We deploy our...

chapter

Supervised Semantic Indexing

Bing Bai, Jason Weston, Ronan Collobert, David Grangier

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 761-765

We present a class of models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI our models are trained with a supervised signal directly on the task of interest, which we argue...

chapter

Split and Merge Based Story Segmentation in News Videos

Anuj Goyal, P. Punitha, Frank Hopfgartner, Joemon M. Jose

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 766-770

Segmenting videos into smaller, semantically related segments that ease access to the video data is a challenging open research problem. In this paper, we present a scheme for semantic story segmentation based on anchor person detection. The proposed model makes use of a split and merge mechanism to find story boundaries. The approach is based on visual features and text transcripts. The performance...

chapter

Encoding Ordinal Features into Binary Features for Text Classification

Andrea Esuli, Fabrizio Sebastiani

Lecture Notes in Computer Science > Advances in Information Retrieval > Posters > 771-775

We propose a method by means of which supervised learning algorithms that only accept binary input can be extended to use ordinal (i.e., integer-valued) input. This is much needed in text classification, since it thus becomes possible to endow these learning devices with term frequency information, rather than just information on the presence/absence of the term in the document. We test two different...
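One way to feed term frequencies to a learner that accepts only binary input is a threshold (thermometer) encoding, sketched below. This is an assumption for illustration: the abstract is truncated before it names the two encodings the authors test, and the function names `thermometer_encode` and `encode_document` are hypothetical.

```python
def thermometer_encode(tf, max_level):
    """Encode an integer term frequency as max_level binary features.

    Feature number `level` is 1 when tf >= level, so higher frequencies
    switch on more features; frequencies above max_level saturate.
    """
    return [1 if tf >= level else 0 for level in range(1, max_level + 1)]


def encode_document(term_freqs, vocabulary, max_level=3):
    """Concatenate the binary encodings of every vocabulary term."""
    features = []
    for term in vocabulary:
        features.extend(thermometer_encode(term_freqs.get(term, 0), max_level))
    return features


# A document in which "ir" occurs twice and "web" does not occur.
vector = encode_document({"ir": 2}, ["ir", "web"], max_level=3)
```

The first feature of each term reproduces plain presence/absence, so this encoding strictly extends the binary representation rather than replacing it.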

Part:
Posters
Series:
Lecture Notes in Computer Science

Keywords

AUTOMATED TEXT SUMMARIZATION (1)
BLOGS (1)
COHERENT AND NON-REDUNDANT SUMMARIES (1)
COVER COEFFICIENT CONCEPT (1)
DOCUMENT RANKING (1)
DOMAIN MODELLING (1)
EFFICIENT SUMMARIZATION (1)
FORMAL CONCEPT ANALYSIS (1)
FRAMEWORK (1)
GLASS-BOX EVALUATION (1)
IMPLICIT RELEVANCE FEEDBACK (1)
INFORMATION EXTRACTION (1)
INFORMATION RETRIEVAL (1)
MULTI-DOCUMENT SUMMARIZATION (1)
N-GRAMS (1)
OPINION DETECTION (1)
PLAGIARISM DETECTION (1)
QUERY REFINEMENT (1)
QUESTION-ANSWERING SYSTEM (1)
REFERENCE CORPUS (1)
RELATIONAL DATABASE (1)
RELEVANCE OF CRITERIA (1)
SEMANTIC INDEXING (1)
SEMANTIC RELATEDNESS (1)
SUPERVISED (1)
TEXT REUSE (1)
WORDNET (1)